-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core(driver): handle promise rejections from sendCommand #6739
Conversation
@@ -893,8 +894,8 @@ class Driver { | |||
// These can 'race' and that's OK. | |||
// We don't want to wait for Page.navigate's resolution, as it can now | |||
// happen _after_ onload: https://crbug.com/768961 | |||
this.sendCommand('Page.enable'); | |||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | |||
await this.sendCommand('Page.enable'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay to await these? The comments above them talk about racing like this might be purposefully not-awaited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug is marked as fixed now. If it truly is, can you also remove the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the big bug was definitely about not awaiting Page.navigate
, but why are we needing to await
these? is it just about unhandled rejections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could still timeout on any protocol command. what happens in the underlying connection dies? we should avoid any possible unhandled rejection.
But await
each command isn't the only way to avoid unhandled rejections here. wdyt of just tacking a catch
on Page.enable, or combining the two commands like this:
await Promise.all([
this.sendCommand('Page.enable'),
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}),
])
so that it's one less event loop iteration ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right I'm onboard with that, but I'm asking because there are then other things we might try instead of serialling awaiting like this._sendAndSilentlyIgnoreErrors
or something :)
It looks like we used to be just fine waiting on these in sequence though, and the Page.navigate
change just moved all of 'em to be together, so we don't need to go crazy avoiding the await
#3413
I'm a tad spooked by this given that it was a real nasty bug to hunt down, but I think this makes sense if the bug really is fixed. +1 to removing the comments if we're making this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_innerSendCommand
is specifically for when you don't want to timeout.
I'm not sure if _innerSendCommand
can ever reject. You could try setting a breakpoint just before, then kill Chrome, then hit play, and see what happens ... But if I read the code correctly for how the connection works, it won't ever result in a rejected promise - just a stalled one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I guess sendCommand
is never the real problem it's always the PROTOCOL_TIMEOUT that's the real problem?
You guys beat me to the punch you're already on it 👍 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I'm not following the conversation correctly here, but _innerSendCommand
can reject via connection
:
lighthouse/lighthouse-core/gather/connections/connection.js
Lines 120 to 124 in 9ebf069
return callback.resolve(Promise.resolve().then(_ => { | |
if (object.error) { | |
log.formatProtocol('method <= browser ERR', {method: callback.method}, 'error'); | |
throw LHError.fromProtocolMessage(callback.method, object.error); | |
} |
(throwing inside Promise.resolve().then(_ => {/* throw */})
rejects it, then callback.resolve()
with a rejected promise rejects as well. We could probably make that a lot more understandable with an explicit callback.resolve
or callback.reject
, as well as using an async
function for the inner part)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so do we want to add an empty catch
to the this._innerSendCommand('Page.navigate', {url});
line? Or are we assuming it's so rare that it's ok that a unhandledRejection
listener can catch it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it seems like this never happens, should we just let sleeping dragons lie?
@@ -893,8 +894,8 @@ class Driver { | |||
// These can 'race' and that's OK. | |||
// We don't want to wait for Page.navigate's resolution, as it can now | |||
// happen _after_ onload: https://crbug.com/768961 | |||
this.sendCommand('Page.enable'); | |||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | |||
await this.sendCommand('Page.enable'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug is marked as fixed now. If it truly is, can you also remove the comments?
lighthouse-core/gather/driver.js
Outdated
} | ||
|
||
/** @type {(() => void)} */ | ||
let cancel = () => { | ||
throw new Error('_waitForCPUIdle.cancel() called before it was defined'); | ||
}; | ||
const promise = new Promise((resolve, reject) => { | ||
checkForQuiet(this, resolve); | ||
checkForQuiet(this, resolve, reject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if reject is called twice? is that possible here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, once a promise has been resolved/rejected all further calls will be noops
@@ -893,8 +894,8 @@ class Driver { | |||
// These can 'race' and that's OK. | |||
// We don't want to wait for Page.navigate's resolution, as it can now | |||
// happen _after_ onload: https://crbug.com/768961 | |||
this.sendCommand('Page.enable'); | |||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | |||
await this.sendCommand('Page.enable'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the big bug was definitely about not awaiting Page.navigate
, but why are we needing to await
these? is it just about unhandled rejections?
@@ -893,8 +894,8 @@ class Driver { | |||
// These can 'race' and that's OK. | |||
// We don't want to wait for Page.navigate's resolution, as it can now | |||
// happen _after_ onload: https://crbug.com/768961 | |||
this.sendCommand('Page.enable'); | |||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | |||
await this.sendCommand('Page.enable'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I guess sendCommand
is never the real problem it's always the PROTOCOL_TIMEOUT that's the real problem?
You guys beat me to the punch you're already on it 👍 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3413 also provides a repro step. Updated for the current cli flags, that would be yarn start ----emulated-form-factor=none --throttling-method=provided http://example.com --verbose
, run against this PR and not seeing a long timeout pause.
However the main solution from #3413 seems to just be not waiting on Page.navigate
. This still doesn't do that, so that shouldn't be a problem even if the underlying crbug isn't actually fixed.
@@ -893,8 +894,8 @@ class Driver { | |||
// These can 'race' and that's OK. | |||
// We don't want to wait for Page.navigate's resolution, as it can now | |||
// happen _after_ onload: https://crbug.com/768961 | |||
this.sendCommand('Page.enable'); | |||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | |||
await this.sendCommand('Page.enable'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so do we want to add an empty catch
to the this._innerSendCommand('Page.navigate', {url});
line? Or are we assuming it's so rare that it's ok that a unhandledRejection
listener can catch it?
lighthouse-core/gather/driver.js
Outdated
this.sendCommand('Page.enable'); | ||
this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | ||
await this.sendCommand('Page.enable'); | ||
await this.sendCommand('Emulation.setScriptExecutionDisabled', {value: disableJS}); | ||
// No timeout needed for Page.navigate. See #6413. | ||
this._innerSendCommand('Page.navigate', {url}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, after talking this through for a little while, we think we do want to incorporate the Page.navigate
errors into the promise chain. It's a little ugly, so if anyone has any other structure ideas, please chime in :)
Basically it would be something like
const waitforPageNavigateCmd = this._innerSendCommand('Page.navigate', {url});
// ...
let pageWaiter = Promise.resolve();
if (waitForNavigated) {
pageWaiter = this._waitForFrameNavigated();
} else if (waitForLoad) {
// ...
pageWaiter = this._waitForFullyLoaded(/* ... */);
}
await Promise.all([
waitforPageNavigateCmd,
pageWaiter,
]);
- If
Page.navigate
has no problem, as before it doesn't block or interfere with observing the page load - If
Page.navigate
throws, it wins thePromise.all
with a rejection, which leads to lighthouse exit.
The main issue I see (besides reduced readability) is that we don't clean up the _waitForFullyLoaded
machinery. This is a fatal error so it doesn't matter in most clients, but are we going to have to actually clean up to support when LH is run as a module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also during error testing we discovered that _waitForFrameNavigated
should probably also have a timeout instead of blindly waiting on the Page.frameNavigated
event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we discovered that _waitForFrameNavigated should probably also have a timeout
WDYT about having an option to this.once
that does the timeout bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that we don't clean up the _waitForFullyLoaded machinery
why don't we just await waitForPageNavigateCmd
as the ultimate very last thing? the only thing we want to happen is get the errors into the chain, right? if we do it as the very last thing, then it can't prevent other cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about having an option to this.once that does the timeout bit?
yeah, good idea. We have at least one or two places we will happily wait forever on an event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though the service worker one I'm thinking of only waits forever on about:blank
, which adding an _waitForFrameNavigated
timeout will fix :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we just await
waitForPageNavigateCmd
as the ultimate very last thing? the only thing we want to happen is get the errors into the chain, right? if we do it as the very last thing, then it can't prevent other cleanup
This would definitely be easier to follow.
The ordering will be a little wonky (I'd imagine a Page.navigate
failure would end up usually failing on another command within _waitForFullyLoaded
or by hitting the 45 second timeout), but ordering is already an issue with Promise.all()
(and without cancelling _waitForFullyLoaded
), and we've already all agreed that Page.navigate
rarely if ever fails :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just the nit, but otherwise LGTM!
} | ||
|
||
if (waitForLoad) { | ||
} else if (waitForLoad) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
lighthouse-core/gather/driver.js
Outdated
@@ -918,6 +911,9 @@ class Driver { | |||
maxWaitMs); | |||
} | |||
|
|||
// Awaiting so that errors can be properly processed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: while this comment makes total sense in the context of this PR, I can definitely see myself coming back in a year and being like "what?" maybe a link to this PR or a tad more explanation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call. Maybe even just something like // Bring any 'Page.navigate' errors back into the promise chain.
and a link back here.
The comment above the Page.navigate
command isn't terribly helpful either, but I don't have a great suggestion :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % nit!
@@ -918,6 +911,9 @@ class Driver { | |||
maxWaitMs); | |||
} | |||
|
|||
// Awaiting so that errors can be properly processed. | |||
await waitforPageNavigateCmd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's nothing in _endNetworkStatusMonitoring we need, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's nothing in _endNetworkStatusMonitoring we need, right?
it returns the finalUrl
after redirects, which is used for updating passContext
(and eventually the artifact).
A little too convoluted but somewhat difficult to fix since networkRecords
don't entirely exist by that point.
Summary
As of #6347 driver's
sendCommand
can reject its Promise if the underlying protocol command times out. This new rejection path isn't handled in some invocations of sendCommand leading to unhandled promise rejections. This would present as us "missing" a PROTOCOL_TIMEOUT in LR, leading to the graceful resolve fix push; however, upon investigation, the cause of missing those timeouts was the timeouts causing an unhandled promise rejection which freaks out LR!This should catch all unhandled promise rejections as of right now. I did a lot of testing to try to track down unhandled promise rejection, but I might have missed some, so if you find any lmk.
Related Issues/PRs